Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, pushing the state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. However, most image data is usually available in compressed format, of which the JPEG is the most widely used due to transmission and storage purposes. For this motive, a preliminary decoding process that has a high computational load and memory usage is demanded. Image decoding can be a performance bottleneck for devices with limited computational resources, such as embedded devices, even when hardware accelerators are used. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency domain representation of the image, like DCT, by a partial decoding, and then make adaptation to typical CNN architectures to work with it. In this paper, we perform an in-depth study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing images through the network. We notice that previous work increased the model's computational complexity to accommodate for the compressed images, nullifying the speed up gained by not decoding images. We propose to remove the changes to the model that increase the computational cost, replacing it with our designed lightweight stems. This way, we can take full advantage of the speed-up obtained by avoiding the decoding. Our strategies were successful in generating models that balance efficiency and effectiveness, allowing deep models to be deployed in a wider array of devices. We achieve up to 25.91% reduction in computational complexity (FLOPs), while only decreasing accuracy in up to 2.97%. We also propose the efficiency-effectiveness score SE to highlight models with favorable trade-offs between accuracy, computational cost and number of parameters.

CNNs for JPEGs: Designing Cost-Efficient Stems / Dos Santos, Samuel Felipe; Sebe, Nicu; Almeida, Jurandy. - In: JOURNAL OF THE BRAZILIAN COMPUTER SOCIETY. - ISSN 0104-6500. - 32:1(2026), pp. 201-215. [10.5753/jbcs.2026.5873]

CNNs for JPEGs: Designing Cost-Efficient Stems

Sebe, Nicu;
2026-01-01

Abstract

Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, pushing the state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. However, most image data is usually available in compressed format, of which the JPEG is the most widely used due to transmission and storage purposes. For this motive, a preliminary decoding process that has a high computational load and memory usage is demanded. Image decoding can be a performance bottleneck for devices with limited computational resources, such as embedded devices, even when hardware accelerators are used. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency domain representation of the image, like DCT, by a partial decoding, and then make adaptation to typical CNN architectures to work with it. In this paper, we perform an in-depth study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing images through the network. We notice that previous work increased the model's computational complexity to accommodate for the compressed images, nullifying the speed up gained by not decoding images. We propose to remove the changes to the model that increase the computational cost, replacing it with our designed lightweight stems. This way, we can take full advantage of the speed-up obtained by avoiding the decoding. Our strategies were successful in generating models that balance efficiency and effectiveness, allowing deep models to be deployed in a wider array of devices. We achieve up to 25.91% reduction in computational complexity (FLOPs), while only decreasing accuracy in up to 2.97%. We also propose the efficiency-effectiveness score SE to highlight models with favorable trade-offs between accuracy, computational cost and number of parameters.
2026
1
Dos Santos, Samuel Felipe; Sebe, Nicu; Almeida, Jurandy
CNNs for JPEGs: Designing Cost-Efficient Stems / Dos Santos, Samuel Felipe; Sebe, Nicu; Almeida, Jurandy. - In: JOURNAL OF THE BRAZILIAN COMPUTER SOCIETY. - ISSN 0104-6500. - 32:1(2026), pp. 201-215. [10.5753/jbcs.2026.5873]
File in questo prodotto:
File Dimensione Formato  
5873-Article Text-37000-1-10-20260302.pdf

accesso aperto

Tipologia: Versione editoriale (Publisher’s layout)
Licenza: Creative commons
Dimensione 464.34 kB
Formato Adobe PDF
464.34 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/481490
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex ND
social impact