Deep RL for UAV Energy and Coverage Optimization in 6G-Based IoT Remote Sensing Networks / Worku, Y. M.; Tshakwanda, P. M.; Tsegaye, H. B.; Devetsikiotis, M.; Sacchi, C.; Christodoulou, C. G. - ELECTRONIC. - (2025), pp. 1-14. (IEEE Aerospace Conference 2025, Big Sky, MT, March 2025) [10.1109/AERO63441.2025.11068737].
Deep RL for UAV Energy and Coverage Optimization in 6G-Based IoT Remote Sensing Networks
Tsegaye H. B.; Devetsikiotis M.; Sacchi C.
2025-01-01
Abstract
The rapid evolution of Internet of Things (IoT) strategies and networking capabilities has catalyzed a pressing need for intelligent and efficient data collection across various domains. This demand is particularly acute in 6G-based IoT remote sensing networks, where the integration of Unmanned Aerial Vehicles (UAVs) emerges as a transformative solution. UAVs offer unprecedented capabilities for enhancing environmental observation, data collection, and communication efficiency in dynamic and challenging environments. However, optimizing their energy consumption and coverage remains a formidable challenge, especially in scenarios requiring real-time data acquisition and transmission. This paper addresses these challenges by introducing novel reinforcement learning (RL) techniques tailored to the unique demands of 6G and Non-Terrestrial Networks (NTN). The methodology integrates state-of-the-art deep learning algorithms designed to optimize strategic data collection via UAVs. Specifically, advanced RL approaches, including Q-learning, Double Deep Q-Network (DDQN), Advantage Actor-Critic (A2C), and Long Short-Term Memory-based Advantage Actor-Critic (LSTM-A2C), are implemented and rigorously compared under diverse operational conditions. These conditions encompass random starting positions, wind interference, and varying environmental complexities, ensuring robust performance assessment. The study focuses on a UAV surveillance scenario in which ground images are captured to collect sensor data on individuals. The primary objective is to derive an optimal trajectory from takeoff to landing locations that maximizes the detection of individuals in the captured images. The problem is formulated as an episodic RL task, emphasizing both efficiency in data acquisition and energy conservation. Notably, the research aims to ensure that UAVs reach their designated landing sites before battery depletion while maximizing user coverage.
Key findings highlight the effectiveness of the proposed RL techniques, with the LSTM-A2C method exhibiting faster convergence and enhanced performance in dynamic environments. Despite occasional challenges in reproducibility across diverse environmental conditions, detailed analyses reveal insights into refining reward structures and optimizing hyperparameters, both essential for enhancing algorithm robustness and deployment feasibility in real-world applications. This study advances the development of robust and efficient RL-based strategies tailored for UAV operations within dynamic 6G-based IoT networks. Beyond addressing fundamental systems engineering challenges, such as energy efficiency and trajectory optimization, the research underscores innovative applications in target tracking and environmental monitoring. The implications extend to sectors requiring reliable, high-fidelity data collection, ranging from disaster response and surveillance to precision agriculture and infrastructure inspection. By integrating advanced RL methodologies, including LSTM-A2C, this research represents a critical step towards unlocking the full potential of UAVs in modern IoT ecosystems, bridging theoretical advancements with practical deployment strategies and contributing to future technologies that are both adaptive and resilient in complex operational environments.
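The episodic formulation described in the abstract (reach a landing site before battery depletion while maximizing user coverage) can be sketched in miniature with tabular Q-learning, the simplest of the compared methods. Everything below is a hypothetical toy instance, not the paper's actual environment: the grid size, battery budget, user positions, and reward values are assumptions chosen for illustration, and the battery level is folded into the state so the landing deadline stays Markov.

```python
import random

# Toy instance of the episodic task (all values hypothetical): a UAV on a
# 5x5 grid takes off at (0, 0) and must land at (4, 4) before its battery
# (a step budget) is exhausted, earning reward for covering ground users.
SIZE, BATTERY = 5, 14
START, GOAL = (0, 0), (4, 4)
USERS = {(1, 3), (2, 1), (3, 3)}            # cells with detectable users
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # E, W, S, N

def step(state, a):
    """Apply one move; return (next_state, reward, done)."""
    r, c, b = state
    dr, dc = MOVES[a]
    r = max(0, min(SIZE - 1, r + dr))       # clamp to grid boundary
    c = max(0, min(SIZE - 1, c + dc))
    b -= 1
    reward = (1.0 if (r, c) in USERS else 0.0) - 0.1  # coverage vs. energy
    if (r, c) == GOAL:
        return (r, c, b), reward + 10.0, True          # safe-landing bonus
    if b == 0:
        return (r, c, b), reward - 5.0, True           # battery depleted
    return (r, c, b), reward, False

def train(episodes=20000, alpha=0.2, gamma=0.95, eps=0.3):
    """Tabular Q-learning with epsilon-greedy exploration."""
    Q = {}
    for _ in range(episodes):
        state, done = START + (BATTERY,), False
        while not done:
            qs = Q.setdefault(state, [0.0] * 4)
            if random.random() < eps:
                a = random.randrange(4)
            else:
                best = max(qs)
                a = random.choice([i for i, q in enumerate(qs) if q == best])
            nxt, r, done = step(state, a)
            target = r if done else r + gamma * max(Q.setdefault(nxt, [0.0] * 4))
            qs[a] += alpha * (target - qs[a])           # TD update
            state = nxt
    return Q

def greedy_path(Q):
    """Roll out the learned greedy policy from takeoff."""
    state, path = START + (BATTERY,), [START]
    for _ in range(BATTERY):
        qs = Q.get(state, [0.0] * 4)
        state, _, done = step(state, qs.index(max(qs)))
        path.append(state[:2])
        if done:
            break
    return path
```

The terminal penalty for battery depletion and the landing bonus encode the paper's hard constraint as reward shaping; the per-step cost stands in for energy consumption. The deep methods the paper compares (DDQN, A2C, LSTM-A2C) would replace the Q table with function approximators over the same episode structure.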



