With the increasing popularity of Unmanned Aerial Vehicles (UAVs) in computer vision-related applications, intelligent UAV video analysis has recently attracted the attention of an increasing number of researchers. To facilitate research in the UAV field, this paper presents a UAV dataset with 100 videos featuring approximately 2700 vehicles recorded under unconstrained conditions and 840k manually annotated bounding boxes. These UAV videos were recorded in complex real-world scenarios and pose significant new challenges, such as complex scenes, high density, small objects, and large camera motion, to the existing object detection and tracking methods. These challenges have encouraged us to define a benchmark for three fundamental computer vision tasks, namely, object detection, single object tracking (SOT) and multiple object tracking (MOT), on our UAV dataset. Specifically, our UAV benchmark facilitates evaluation and detailed analysis of state-of-the-art detection and tracking methods on the proposed UAV dataset. Furthermore, we propose a novel approach based on the so-called Context-aware Multi-task Siamese Network (CMSN) model that explores new cues in UAV videos by judging the consistency degree between objects and contexts and that can be used for SOT and MOT. The experimental results demonstrate that our model could make tracking results more robust in both SOT and MOT, showing that the current tracking and detection methods have limitations in dealing with the proposed UAV benchmark and that further research is indeed needed.

The Unmanned Aerial Vehicle Benchmark: Object Detection, Tracking and Baseline / Yu, Hongyang; Li, Gr; Zhang, Weigang; Huang, Qingming; Du, Dawei; Tian, Qi; Sebe, Nicu. - In: INTERNATIONAL JOURNAL OF COMPUTER VISION. - ISSN 0920-5691. - 128:5(2020), pp. 1141-1159. [10.1007/s11263-019-01266-1]

The Unmanned Aerial Vehicle Benchmark: Object Detection, Tracking and Baseline

Sebe, Nicu
2020-01-01

Abstract

With the increasing popularity of Unmanned Aerial Vehicles (UAVs) in computer vision-related applications, intelligent UAV video analysis has recently attracted the attention of an increasing number of researchers. To facilitate research in the UAV field, this paper presents a UAV dataset with 100 videos featuring approximately 2700 vehicles recorded under unconstrained conditions and 840k manually annotated bounding boxes. These UAV videos were recorded in complex real-world scenarios and pose significant new challenges, such as complex scenes, high density, small objects, and large camera motion, to the existing object detection and tracking methods. These challenges have encouraged us to define a benchmark for three fundamental computer vision tasks, namely, object detection, single object tracking (SOT) and multiple object tracking (MOT), on our UAV dataset. Specifically, our UAV benchmark facilitates evaluation and detailed analysis of state-of-the-art detection and tracking methods on the proposed UAV dataset. Furthermore, we propose a novel approach based on the so-called Context-aware Multi-task Siamese Network (CMSN) model that explores new cues in UAV videos by judging the consistency degree between objects and contexts and that can be used for SOT and MOT. The experimental results demonstrate that our model could make tracking results more robust in both SOT and MOT, showing that the current tracking and detection methods have limitations in dealing with the proposed UAV benchmark and that further research is indeed needed.
2020
5
Yu, Hongyang; Li, Gr; Zhang, Weigang; Huang, Qingming; Du, Dawei; Tian, Qi; Sebe, Nicu
The Unmanned Aerial Vehicle Benchmark: Object Detection, Tracking and Baseline / Yu, Hongyang; Li, Gr; Zhang, Weigang; Huang, Qingming; Du, Dawei; Tian, Qi; Sebe, Nicu. - In: INTERNATIONAL JOURNAL OF COMPUTER VISION. - ISSN 0920-5691. - 128:5(2020), pp. 1141-1159. [10.1007/s11263-019-01266-1]
File in questo prodotto:
File Dimensione Formato  
Yu2020_Article_TheUnmannedAerialVehicleBenchm.pdf

Solo gestori archivio

Tipologia: Versione editoriale (Publisher’s layout)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 4.06 MB
Formato Adobe PDF
4.06 MB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/266631
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 129
  • ???jsp.display-item.citation.isi??? 114
  • OpenAlex ND
social impact