In this paper, we propose a novel temporal spiking recurrent neural network (TSRNN) for robust action recognition in videos. The proposed TSRNN employs a spiking architecture that uses the local discriminative features from high-confidence, reliable frames as spiking signals. Conventional CNN-RNNs typically used for this problem treat all frames as equally important, which makes them error-prone in the presence of noisy frames. The TSRNN addresses this problem with a temporal pooling architecture that helps the RNN select sparse, reliable frames and enhances its capability to model long-range temporal information. In addition, a message-passing bridge is added between the spiking signals and the recurrent unit. In this way, the spiking signals can guide the RNN to correct its long-term memory across multiple frames when it is contaminated by noisy frames with distracting factors (e.g., occlusion, rapid scene transition). With these two novel components, TSRNN achieves competitive performance compared with state-of-the-art CNN-RNN architectures on two large-scale public benchmarks, UCF101 and HMDB51.
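The idea of selecting sparse, reliable frames can be illustrated with a small sketch. This is not the TSRNN implementation: the abstract does not say how frame confidence is computed, so the sketch assumes that confidence is the highest per-frame class probability and that a fixed number `k` of frames is kept. The names `frame_confidence` and `select_reliable_frames` and the sample probabilities are hypothetical.

```python
from typing import List, Sequence


def frame_confidence(probs: Sequence[float]) -> float:
    """Confidence of one frame, assumed here to be its highest class probability."""
    return max(probs)


def select_reliable_frames(frame_probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k most confident frames, in temporal order."""
    # Rank frame indices from most to least confident (assumed criterion).
    ranked = sorted(
        range(len(frame_probs)),
        key=lambda t: frame_confidence(frame_probs[t]),
        reverse=True,
    )
    # Keep the top k and restore temporal order for a downstream recurrent model.
    return sorted(ranked[:k])


# Made-up per-frame class probabilities for a 5-frame clip with 3 classes.
frame_probs = [
    [0.40, 0.35, 0.25],  # ambiguous frame
    [0.90, 0.05, 0.05],  # confident frame
    [0.34, 0.33, 0.33],  # near-uniform, likely noisy
    [0.10, 0.85, 0.05],  # confident frame
    [0.50, 0.30, 0.20],
]

selected = select_reliable_frames(frame_probs, k=2)
print(selected)  # [1, 3]
```

In this toy setting, only the features of frames 1 and 3 would be passed on as "spiking signals"; how the paper actually scores frames and feeds them to the recurrent unit should be taken from the full text.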

Temporal Spiking Recurrent Neural Network for Action Recognition / Wang, Wei; Hao, Siyuan; Wei, Yunchao; Xiao, Shengtao; Feng, Jiashi; Sebe, Nicu. - In: IEEE ACCESS. - ISSN 2169-3536. - 7:(2019), pp. 117165-117175. [10.1109/ACCESS.2019.2936604]

Temporal Spiking Recurrent Neural Network for Action Recognition

Wang, Wei; Sebe, Nicu
2019-01-01

Files in this item:
File: 08808849.pdf
Access: open access
Type: Publisher's version (Publisher's layout)
License: Creative Commons
Size: 2.06 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/250807