Embedding Perspective Analysis into Multi-Column Convolutional Neural Network for Crowd Counting

Yang, Y.; Li, G.; Du, D.; Huang, Q.; Sebe, N.

doi:10.1109/TIP.2020.3043122

Crowd counting is challenging for deep networks for several reasons. For instance, the networks cannot efficiently analyze the perspective information of arbitrary scenes, and they handle scale variations poorly. In this work, we present a simple yet efficient multi-column network that integrates a perspective analysis method with the counting network. The proposed method explicitly extracts the perspective information and drives the counting network to analyze the scenes. More concretely, we derive the perspective information from the estimated density maps and quantize the perspective space into several separate scenes. We then embed the perspective analysis into the multi-column framework with a recurrent connection, so that the proposed network efficiently matches the various scales with the different receptive fields. In addition, we share the parameters across the branches with various receptive fields, a strategy that drives the convolutional kernels to be sensitive to instances of various scales. Furthermore, to improve the evaluation accuracy of the column with a large receptive field, we propose a transform dilated convolution, which breaks the fixed sampling structure of the deep network. It requires no extra parameters or training, and its offsets are constrained to a local region, a design intended for congested scenes. The proposed method achieves state-of-the-art performance on five datasets (ShanghaiTech, UCF_CC_50, WorldExpo'10, UCSD, and TRANCOS).
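As a rough illustration of the parameter-sharing idea mentioned in the abstract, the sketch below applies one kernel at several dilation rates, so that each "column" sees a different receptive field while using the same weights. This is not the authors' implementation: the 1D setting, the function names `dilated_conv1d` and `shared_columns`, the dilation rates, and the toy kernel are all assumptions made for illustration, and the perspective analysis, recurrent connection, and transform dilated convolution are not modeled.

```python
from typing import Dict, List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Same-size 1D dilated correlation with zero padding (odd kernel length assumed)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # taps move further apart as dilation grows
            if 0 <= j < len(signal):       # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def shared_columns(signal: Sequence[float], kernel: Sequence[float],
                   dilations: Sequence[int] = (1, 2, 3)) -> Dict[int, List[float]]:
    """Run every column with the same kernel; only the dilation rate differs."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # a single impulse in the middle
kernel = [1.0, 2.0, 3.0]                        # one parameter set shared by all columns
columns = shared_columns(signal, kernel)
```

With a 3-tap kernel, the response to the impulse spans 3, 5, and 7 samples for dilation rates 1, 2, and 3, which shows how the receptive field widens without adding parameters.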

Embedding Perspective Analysis into Multi-Column Convolutional Neural Network for Crowd Counting / Yang, Y.; Li, G.; Du, D.; Huang, Q.; Sebe, N. - In: IEEE TRANSACTIONS ON IMAGE PROCESSING. - ISSN 1057-7149. - 30 (2021), pp. 1395-1407. [10.1109/TIP.2020.3043122]

2021-01-01

Date of publication	2021
Journal title	IEEE TRANSACTIONS ON IMAGE PROCESSING
DOI	https://dx.doi.org/10.1109/TIP.2020.3043122
PubMed identifier	33315562
Scopus identifier	2-s2.0-85099069335
WOS identifier	WOS:000604831700002
Appears in item types	03.1 Journal article

Files in this item:

File	Access	Type	License	Size	Format
Embedding_Perspective_Analysis_Into_Multi-Column_Convolutional_Neural_Network_for_Crowd_Counting.pdf	Archive managers only	Publisher's version (publisher's layout)	All rights reserved	4.48 MB	Adobe PDF