The Good, The Bad, and The Ugly: Neural Networks Straight From JPEG / Santos, Samuel Felipe dos; Sebe, Nicu; Almeida, Jurandy. - (2020), pp. 1896-1900. (Paper presented at the 2020 IEEE International Conference on Image Processing (ICIP), held in Abu Dhabi, 25th-28th October 2020) [10.1109/ICIP40778.2020.9190741].

The Good, The Bad, and The Ugly: Neural Networks Straight From JPEG

Sebe, Nicu
2020-01-01

Abstract

Over the past decade, convolutional neural networks (CNNs) have achieved state-of-the-art performance in many computer vision tasks. They can learn robust representations of image data by processing RGB pixels. Since image data are often stored in a compressed format, of which JPEG is the most widespread, a preliminary decoding step is required. Recently, the design of CNNs that process JPEG compressed data has gained attention from the research community. Such networks process DCT coefficients instead of RGB pixels, which saves the computation needed to decode JPEG images but increases the computational complexity of the network. In this paper, we examine how spatial resolution and JPEG quality affect the performance of a state-of-the-art CNN designed to operate directly in the JPEG compressed domain. To alleviate its computational complexity, we propose a Frequency Band Selection (FBS) technique that selects the most relevant DCT coefficients before feeding them to the network. Experiments were conducted on a subset of the ImageNet dataset, considering both fine- and coarse-grained image classification tasks. Results show that such networks are resilient to changes in JPEG quality but are sensitive to spatial resolution. Also, our FBS can reduce the computational complexity of the network while retaining similar accuracy.
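The idea of selecting DCT coefficients before they reach the network can be illustrated with a minimal sketch. The abstract does not say how FBS decides which coefficients are most relevant, so the code below assumes one plausible criterion: keep the lowest-frequency coefficients of each 8x8 block, taken in the JPEG zigzag order. All function names (`dct_matrix`, `block_dct`, `zigzag_order`, `select_bands`) and the `keep` parameter are hypothetical and are not taken from the paper.

```python
import math

BLOCK = 8  # JPEG transforms each channel in 8x8 blocks


def dct_matrix(n=BLOCK):
    """Return the n x n orthonormal DCT-II basis as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def block_dct(block):
    """2-D DCT of one square block, computed as C * X * C^T."""
    c = dct_matrix(len(block))
    c_t = [list(col) for col in zip(*c)]
    return matmul(matmul(c, block), c_t)


def zigzag_order(n=BLOCK):
    """(row, col) positions ordered from low to high frequency (JPEG zigzag)."""
    def key(rc):
        diagonal = rc[0] + rc[1]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        return (diagonal, rc[0] if diagonal % 2 else rc[1])
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def select_bands(coeffs, keep):
    """Assumed stand-in for FBS: keep the first `keep` zigzag coefficients."""
    return [coeffs[r][c] for r, c in zigzag_order(len(coeffs))[:keep]]


# A flat block has all its energy in the DC coefficient.
flat = [[10.0] * BLOCK for _ in range(BLOCK)]
bands = select_bands(block_dct(flat), keep=16)
print(len(bands))  # 16
```

With `keep=16`, each block contributes 16 values instead of 64, which is the sense in which dropping frequency bands shrinks the network input. A real pipeline would read the quantized coefficients directly from the JPEG file rather than recompute the DCT from pixels.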
2020
2020 IEEE International Conference on Image Processing Proceedings
Piscataway, NJ
IEEE
978-1-7281-6395-6
Santos, Samuel Felipe dos; Sebe, Nicu; Almeida, Jurandy
Files in this item:
File: 09190741.pdf
Access: Archive managers only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 95.05 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/284551