Embedding Perspective Analysis into Multi-Column Convolutional Neural Network for Crowd Counting / Yang, Y.; Li, G.; Du, D.; Huang, Q.; Sebe, N. - In: IEEE TRANSACTIONS ON IMAGE PROCESSING. - ISSN 1057-7149. - 30:(2021), pp. 1395-1407. [10.1109/TIP.2020.3043122]

Embedding Perspective Analysis into Multi-Column Convolutional Neural Network for Crowd Counting

Sebe N.
2021-01-01

Abstract

Crowd counting is challenging for deep networks for several reasons. For instance, networks cannot efficiently analyze the perspective information of arbitrary scenes, and they are inherently inefficient at handling scale variations. In this work, we present a simple yet efficient multi-column network that integrates perspective analysis with the counting network. The proposed method explicitly extracts the perspective information and uses it to drive the counting network's analysis of the scene. More concretely, we derive the perspective information from the estimated density maps and quantize the perspective space into several separate scenes. We then embed the perspective analysis into the multi-column framework through a recurrent connection, so that the network efficiently matches different scales with different receptive fields. In addition, we share parameters across the branches with different receptive fields, which drives the convolutional kernels to be sensitive to instances at various scales. Furthermore, to improve the accuracy of the column with a large receptive field, we propose a transform dilated convolution, which breaks the fixed sampling structure of the deep network. It requires no extra parameters or training, and its offsets are constrained to a local region, a design intended for congested scenes. The proposed method achieves state-of-the-art performance on five datasets (ShanghaiTech, UCF_CC_50, WorldEXPO'10, UCSD, and TRANCOS).
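The idea of sharing parameters across branches with different receptive fields can be illustrated with a toy example. The sketch below is not the authors' network: it is a 1-D, pure-Python illustration in which one kernel is reused at several dilation rates, so each column sees a different receptive field without adding weights. The function names, the dilation rates (1, 2, 3), and the input signal are all assumptions chosen for illustration; the abstract does not specify them.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """Valid 1-D cross-correlation with `dilation` steps between kernel taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


def multi_column(signal, kernel, dilations=(1, 2, 3)):
    """Hypothetical multi-column layer: every column reuses the SAME kernel
    (shared parameters) and differs only in its dilation rate."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


# Toy input with two "instances" of different magnitude (illustrative only).
signal = [0, 0, 1, 0, 0, 0, 2, 0, 0, 0]
shared_kernel = [1, 1, 1]  # one set of weights for all columns

columns = multi_column(signal, shared_kernel)
for d, out in columns.items():
    print(f"dilation={d} receptive_field={receptive_field(3, d)} output={out}")
```

With a kernel of size 3, the three columns cover 3, 5, and 7 input positions respectively while storing only three weights in total. A real counting network would apply the same principle to 2-D feature maps and learned kernels.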
2021
Yang, Y.; Li, G.; Du, D.; Huang, Q.; Sebe, N.
Files in this item:
File: Embedding_Perspective_Analysis_Into_Multi-Column_Convolutional_Neural_Network_for_Crowd_Counting.pdf
Access: Archive managers only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 4.48 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/326158
Citations
  • PubMed Central 0
  • Scopus 6
  • Web of Science 19