Special Sessions

We are pleased to announce the following accepted Special Sessions for PCS2021.


Special Session 1

Perceptually Driven Techniques for Video Compression and Quality Assessment

Session organizers: Christos Bampis (Netflix, USA), Lukas Krasula (Netflix, USA), Zhi Li (Netflix, USA).

One of the most important aspects of picture coding is its impact on human perception, as the human eye is the ultimate receiver in a variety of picture-related applications. Recent advances in perceptual quality assessment and video compression have provided tools that enabled substantial improvements in areas such as video streaming and teleconferencing. The rapid increase in internet traffic caused by the global pandemic clearly demonstrated the importance of such technology and revealed room for improvement in managing the data load. The goal of this special session is to provide a forum for sharing and discussing cutting-edge research in Video Compression and Quality Assessment inspired by perceptual aspects of the human visual system. Possible topics that are a good fit for this session include but are not limited to:

  • Perceptual optimization of deep-network-based approaches for video compression and motion estimation.
  • Inventive ways of capturing and/or mitigating visual artifacts under challenging conditions, such as low-light scenes, newer codecs, or interaction of multiple distortions.
  • Novel video compression tools that enhance perceptual quality, such as in-loop filters or other pre-processing or post-processing techniques.
  • Design of novel objective video quality metrics that are also applicable to any of the above.

Special Session 2

Learning-based Image Coding

Session organizers: João Ascenso (Instituto Superior Técnico, Portugal), Fernando Pereira (Instituto Superior Técnico, Portugal) and Touradj Ebrahimi (École Polytechnique Fédérale de Lausanne, Switzerland).

Image coding algorithms create compact representations of an image by exploiting spatial redundancies and perceptual irrelevance, thus taking advantage of the characteristics of the human visual system. Recently, data-driven algorithms such as neural networks have attracted a lot of attention and have become a popular area of research and development. This interest is driven by several factors, such as recent advances in processing power (cheap and powerful hardware), the availability of large data sets (big data) and several algorithmic and architectural advances (e.g. generative adversarial networks).

Nowadays, machine learning through neural networks is the state of the art for several computer vision tasks, including those requiring high-level understanding of content semantics (e.g. image classification, object segmentation, saliency detection) as well as low-level image processing tasks such as image denoising, inpainting and super-resolution. These advances have led to an increased interest in applying deep neural networks to image coding, which is the main focus of the JPEG AI activity within the JPEG standardization committee. The aim of these novel image coding solutions is to design a compact image representation model that has been obtained (learned) from a large amount of visual data and can efficiently represent the wide variety of visual content that is available today. Some of the early learning-based image coding solutions already show encouraging results in terms of rate-distortion (RD) performance, notably in comparison to conventional standard image codecs (e.g. JPEG 2000 and HEVC Intra), which compress images with hand-crafted transforms, entropy coding and quantization schemes.

This proposal for a PCS 2021 Special Session on Learning-based Image Coding gathers technical contributions that demonstrate the efficient coding of image content based on a learning-based approach. This topic has received many contributions in recent years and is considered critical to the future of image coding, especially solutions in which learning-based tools replace the previous conventional architectures and adopt end-to-end training. This special session proposal collects a wide range of contributions on this topic, namely non-linear data transformations, probability models for entropy coding, block-based coding structures, rate-allocation procedures, intra prediction tools, prediction filters and complexity optimizations. Moreover, a recent subjective quality study performed in the context of JPEG AI is also included, in which relevant learning-based image coding solutions are assessed and the potential of this novel coding approach is analyzed.

Special Session 3

AI for VVC optimisations and enhancements

Session organizers: M. Abdoli (ATEME, France), T. Biatek (ATEME, France), E. François (InterDigital, France), and W. Hamidouche (IETR, France).

The next-generation video coding standard Versatile Video Coding (VVC), developed by the Joint Video Experts Team (JVET) jointly established by MPEG/ISO and VCEG/ITU, reached the Final Draft International Standard (FDIS) stage in July 2020. VVC introduces numerous new coding tools enabling around 50% bitrate reduction for the same subjective quality with respect to its predecessor standard HEVC. VVC is intended to facilitate the deployment of new video formats and applications such as omnidirectional (360°), High Dynamic Range, cloud gaming, screen content and 8K resolution content distribution, which require special attention for compression and transmission. However, VVC complexity has significantly increased compared to HEVC, and thus encoding complexity needs to be carefully addressed to take advantage of the VVC coding efficiency in real-time applications.

On the other hand, recent advances in Artificial Intelligence (AI) have brought outstanding performance in many computer vision applications, including image super-resolution and quality enhancement. These machine learning and deep learning approaches can be considered to tackle both the quality enhancement of decoded VVC video and the increase in VVC complexity. They are also being considered by JVET and MPEG as potential tracks to enhance the VVC standard with new NN-based coding tools. This special session will focus on deep learning-based solutions for VVC video quality enhancement and complexity reduction performed as pre-processing, post-processing or inside the video codec.

Topics of relevance include but are not limited to:

  • AI for VVC pre- and post-processing
  • AI-driven VVC encoding decision
  • AI-based coding tools for possible VVC extensions
  • AI for VVC compression performance improvement

Special Session 4

Video encoding for large scale HAS deployments

Session organizers: Christian Timmerer (Bitmovin, Austria), Mohammad Ghanbari (University of Essex, UK), and Alex Giladi (Comcast, USA).

Video accounts for the vast majority of today’s internet traffic, and video coding is vital for efficient distribution towards the end-user. Software- and/or cloud-based video coding is becoming more and more attractive, specifically with the plethora of video codecs available right now (e.g., AVC, HEVC, VVC, VP9, AV1), a trend also supported by the latest Bitmovin Video Developer Report 2020 [1]. Thus, improvements in video coding that enable efficient adaptive video streaming are a requirement for current and future video services. HTTP Adaptive Streaming (HAS) is now mainstream due to its simplicity, reliability, and standard support (e.g., MPEG-DASH). For HAS, the video is usually encoded in multiple versions (i.e., representations) of different resolutions, bitrates, codecs, etc., and each representation is divided into chunks (i.e., segments) of equal length (e.g., 2-10 sec) to enable dynamic, adaptive switching during streaming based on the user’s context conditions (e.g., network conditions, device characteristics, user preferences). In this context, most scientific papers in the literature target various improvements which are evaluated based on open, standard test sequences. We argue that optimizing video encoding for large scale HAS deployments is the next step in order to improve the Quality of Experience (QoE), while optimizing costs.
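As a rough illustration of the HAS structure described above, the following Python sketch models a set of representations, splits a video into equal-length segments, and picks a representation per segment from a measured throughput. The ladder values, the 0.8 safety factor, and all names (Representation, LADDER, segment_count, pick_representation) are hypothetical and chosen only for illustration; real players use considerably more elaborate adaptation logic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoded version of the video (hypothetical fields)."""
    name: str
    width: int
    height: int
    bitrate_kbps: int


# Hypothetical bitrate ladder; the values are illustrative, not recommendations.
LADDER = [
    Representation("360p", 640, 360, 800),
    Representation("720p", 1280, 720, 2500),
    Representation("1080p", 1920, 1080, 5000),
]


def segment_count(duration_s: float, segment_s: float) -> int:
    """Number of equal-length segments needed to cover the video."""
    return math.ceil(duration_s / segment_s)


def pick_representation(throughput_kbps: float, ladder=LADDER, safety: float = 0.8):
    """Pick the highest-bitrate representation that fits the throughput budget.

    Falls back to the lowest-bitrate representation if none fits.
    """
    budget = throughput_kbps * safety
    affordable = [r for r in ladder if r.bitrate_kbps <= budget]
    if not affordable:
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(affordable, key=lambda r: r.bitrate_kbps)


def simulate(throughputs_kbps):
    """Choose one representation per segment from per-segment throughput samples."""
    return [pick_representation(t).name for t in throughputs_kbps]


if __name__ == "__main__":
    print(segment_count(60, 4))
    print(simulate([500, 4000, 10000]))
```

Because every representation is cut at the same segment boundaries, the client can switch between representations at any segment without re-synchronizing, which is what makes the per-segment choice in simulate possible.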

  1. https://bitmovin.com/
  2. https://doi.org/10.1109/COMST.2018.2862938

Special Session 5

Coding and quality evaluation of light-fields

Session organizers: Frederic Dufaux (CNRS, France) and Rafal Mantiuk (University of Cambridge, UK).

Recent advances in acquisition technologies such as plenoptic cameras and camera arrays make light-field content creation more accessible. On another front, advances in display technologies such as light-field displays provide a glasses-free 3D viewing experience to users. Such technological trends open the door to a more realistic visual experience. Although three-dimensional data can be represented in various ways, light-field representations have gained popularity and attracted the attention of the scientific community. Since a light field represents a three-dimensional scene as a collection of two-dimensional images, it makes it possible to utilize and extend the current theoretical and practical knowledge on two-dimensional images to three-dimensional space. In addition, light-field content can be converted to other three-dimensional representations such as point clouds, multi-plane images, holograms, signed distance fields or neural representations. With the increased popularity of such high-dimensional data, coding algorithms and the evaluation of coding-related algorithms have gained importance. The purpose of the proposed special session is to promote further research on the compression of light-field content and quality evaluation of artefacts resulting from such compression. We aim to highlight recent advances in the field and provide a venue for discussion.
