Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, pushing the state of the art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. However, most image data is available in compressed format, of which JPEG is the most widely used owing to its advantages for transmission and storage. As a result, a preliminary decoding step with a high computational load and memory usage is required. Image decoding can be a performance bottleneck for devices with limited computational resources, such as embedded devices, even when hardware accelerators are used. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency-domain representation of the image, such as the DCT, through a partial decoding, and then adapt typical CNN architectures to work with it. In this paper, we perform an in-depth study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing images through the network. We observe that previous work increased the model's computational complexity to accommodate compressed images, nullifying the speed-up gained by not decoding them. We propose to remove the changes to the model that increase the computational cost, replacing them with our lightweight stems. This way, we can take full advantage of the speed-up obtained by avoiding the decoding. Our strategies were successful in generating models that balance efficiency and effectiveness, allowing deep models to be deployed on a wider array of devices. We achieve up to a 25.91% reduction in computational complexity (FLOPs), while decreasing accuracy by at most 2.97%.
We also propose the efficiency-effectiveness score SE to highlight models with favorable trade-offs between accuracy, computational cost, and number of parameters.
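To illustrate the kind of input such compressed-domain models consume, the following is a minimal sketch (not the paper's implementation) of the 8×8 block-wise DCT representation that a partial JPEG decode exposes: each 8×8 block of a channel yields 64 frequency coefficients, producing a spatially downsampled tensor that a CNN with a suitable stem can ingest directly. The function name `blockwise_dct` is illustrative.

```python
import numpy as np
from scipy.fftpack import dct

def blockwise_dct(channel: np.ndarray, block: int = 8) -> np.ndarray:
    """Compute the 8x8 block-wise 2D DCT-II of one image channel,
    mimicking the frequency-domain representation a partial JPEG
    decode exposes (one 64-coefficient vector per block)."""
    h, w = channel.shape
    assert h % block == 0 and w % block == 0, "pad the image to a multiple of 8 first"
    # Split into non-overlapping 8x8 blocks: (H/8, W/8, 8, 8).
    blocks = channel.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    # Orthonormal 2D DCT-II applied along the last two axes.
    coeffs = dct(dct(blocks, axis=-1, norm="ortho"), axis=-2, norm="ortho")
    # Flatten each block's 64 coefficients into the channel dimension,
    # giving an (H/8, W/8, 64) tensor usable as CNN input.
    return coeffs.reshape(h // block, w // block, block * block)

# Example: a 32x32 grayscale image becomes a 4x4 map with 64 frequency channels.
img = np.random.rand(32, 32).astype(np.float32)
feat = blockwise_dct(img)
print(feat.shape)  # (4, 4, 64)
```

Note the spatial resolution drops by a factor of 8 in each dimension while the channel count grows to 64 per color plane, which is why a network's stem must be redesigned rather than reused as-is.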
CNNs for JPEGs: Designing Cost-Efficient Stems / Dos Santos, Samuel Felipe; Sebe, Nicu; Almeida, Jurandy. - In: JOURNAL OF THE BRAZILIAN COMPUTER SOCIETY. - ISSN 0104-6500. - 32:1(2026), pp. 201-215. [10.5753/jbcs.2026.5873]