Multi-objective autotuning of mobile nets across the full software/hardware stack

IRIS

We present a customizable Collective Knowledge workflow to study the execution time vs. accuracy trade-offs for the MobileNets CNN family. We use this workflow to evaluate MobileNets on Arm Cortex CPUs using TensorFlow and Arm Mali GPUs using several versions of the Arm Compute Library. Our optimizations for the Arm Bifrost GPU architecture reduce the execution time by 2-3 times, while lying on a Pareto-optimal frontier. We also highlight the challenge of maintaining the accuracy when deploying CNN models across diverse platforms. We make all the workflow components (models, programs, scripts, etc.) publicly available to encourage further exploration by the community.

Multi-objective autotuning of mobile nets across the full software/hardware stack / Lokhmotov, A.; Vella, F.; Chunosov, N.; Fursin, G.. - ELETTRONICO. - (2018), p. 1. (Intervento presentato al convegno 1st ACM ReQuEST Workshop/Tournament on Reproducible Software/Hardware Co-Design of Pareto-Efficient Deep Learning, ReQuEST 2018 tenutosi a Williamsburg, VA, USA nel March 24th – March 28th 2018) [10.1145/3229762.3229767].

Multi-objective autotuning of mobile nets across the full software/hardware stack

Lokhmotov A.;Vella F.;Chunosov N.;Fursin G.

2018-01-01

Abstract

We present a customizable Collective Knowledge workflow to study the execution time vs. accuracy trade-offs for the MobileNets CNN family. We use this workflow to evaluate MobileNets on Arm Cortex CPUs using TensorFlow and Arm Mali GPUs using several versions of the Arm Compute Library. Our optimizations for the Arm Bifrost GPU architecture reduce the execution time by 2-3 times, while lying on a Pareto-optimal frontier. We also highlight the challenge of maintaining the accuracy when deploying CNN models across diverse platforms. We make all the workflow components (models, programs, scripts, etc.) publicly available to encourage further exploration by the community.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione (Date of publication)
	
				2018
			
	Titolo del volume (Proceedings title)
	
				Proceedings of the 1st Reproducible Quality-Efficient Systems Tournament on Co-Designing Pareto-Efficient Deep Learning, ReQuEST 2018 - Co-located with ACM ASPLOS 2018
			
	Luogo di edizione (Place of publication)
	
				New York, USA
			
	Casa editrice (Publisher)
	
				Association for Computing Machinery, Inc
			
	ISBN
	
				9781450359238
			
	Codice Scopus (Scopus Identifier)
	
				2-s2.0-85050640500
			
	Codice WOS (WOS identifier)
	
				WOS:000491870400005
			
	Tutti gli autori
	
						Lokhmotov, A.; Vella, F.; Chunosov, N.; Fursin, G.
					
	Citazione
	
				Multi-objective autotuning of mobile nets across the full software/hardware stack / Lokhmotov, A.; Vella, F.; Chunosov, N.; Fursin, G.. - ELETTRONICO. - (2018), p. 1. (Intervento presentato al  convegno 1st ACM ReQuEST Workshop/Tournament on Reproducible Software/Hardware Co-Design of Pareto-Efficient Deep Learning, ReQuEST 2018 tenutosi a Williamsburg, VA, USA nel March 24th – March 28th 2018) [10.1145/3229762.3229767].
			
	Appare nelle tipologie:
	
				04.1 Saggio in atti di convegno (Paper in Proceedings)

File in questo prodotto:

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/332800

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

9

3

ND

social impact