This paper presents a new setting for visual question answering (VQA) called personalized federated VQA (FedVQA) that addresses the growing need for decentralization and data privacy protection. FedVQA is both practical and challenging, requiring clients to learn well-personalized models on scene-specific datasets with severe feature/label distribution skews. These models then collaborate to optimize a generic global model on a central server, which is desired to generalize well on both seen and unseen scenes without sharing raw data with the server and other clients. The primary challenge of FedVQA is that, client models tend to forget the global knowledge initialized from central server during the personalized training, which impairs their personalized capacity due to the potential overfitting issue on local data. This further leads to divergence issues when aggregating distinct personalized knowledge at the central server, resulting in an inferior generalization ability on unseen scenes. To address the challenge, we propose a novel federated pairwise preference preserving (FedP3 ) framework to improve personalized learning via preserving generic knowledge under FedVQA constraints. Specifically, we first design a differentiable pairwise preference (DPP) to improve knowledge preserving by formulating a flexible yet effective global knowledge. Then, we introduce a forgotten-knowledge filter (FKF) to encourage the client models to selectively consolidate easily-forgotten knowledge. By aggregating the DPP and the FKF, FedP3 coordinates the generic and the personalized knowledge to enhance the personalized ability of clients and generalizability of the server. Extensive experiments show that FedP3 consistently surpasses the state-of-the-art in FedVQA task.

FedVQA: Personalized Federated Visual Question Answering over Heterogeneous Scenes / Lao, Mingrui; Pu, Nan; Zhong, Zhun; Sebe, Nicu; Lew, Michael S.. - (2023), pp. 7796-7807. (Intervento presentato al convegno ACM Multimedia tenutosi a Ottawa, Canada nel October 2023) [10.1145/3581783.3611958].

FedVQA: Personalized Federated Visual Question Answering over Heterogeneous Scenes

Pu, Nan;Zhong, Zhun;Sebe, Nicu;
2023-01-01

Abstract

This paper presents a new setting for visual question answering (VQA) called personalized federated VQA (FedVQA) that addresses the growing need for decentralization and data privacy protection. FedVQA is both practical and challenging, requiring clients to learn well-personalized models on scene-specific datasets with severe feature/label distribution skews. These models then collaborate to optimize a generic global model on a central server, which is desired to generalize well on both seen and unseen scenes without sharing raw data with the server and other clients. The primary challenge of FedVQA is that, client models tend to forget the global knowledge initialized from central server during the personalized training, which impairs their personalized capacity due to the potential overfitting issue on local data. This further leads to divergence issues when aggregating distinct personalized knowledge at the central server, resulting in an inferior generalization ability on unseen scenes. To address the challenge, we propose a novel federated pairwise preference preserving (FedP3 ) framework to improve personalized learning via preserving generic knowledge under FedVQA constraints. Specifically, we first design a differentiable pairwise preference (DPP) to improve knowledge preserving by formulating a flexible yet effective global knowledge. Then, we introduce a forgotten-knowledge filter (FKF) to encourage the client models to selectively consolidate easily-forgotten knowledge. By aggregating the DPP and the FKF, FedP3 coordinates the generic and the personalized knowledge to enhance the personalized ability of clients and generalizability of the server. Extensive experiments show that FedP3 consistently surpasses the state-of-the-art in FedVQA task.
2023
MM '23: Proceedings of the 31st ACM International Conference on Multimedia
New York, NY, United States
ACM
9798400701085
Lao, Mingrui; Pu, Nan; Zhong, Zhun; Sebe, Nicu; Lew, Michael S.
FedVQA: Personalized Federated Visual Question Answering over Heterogeneous Scenes / Lao, Mingrui; Pu, Nan; Zhong, Zhun; Sebe, Nicu; Lew, Michael S.. - (2023), pp. 7796-7807. (Intervento presentato al convegno ACM Multimedia tenutosi a Ottawa, Canada nel October 2023) [10.1145/3581783.3611958].
File in questo prodotto:
File Dimensione Formato  
3581783.3611958 (1)-compressed.pdf

accesso aperto

Tipologia: Versione editoriale (Publisher’s layout)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 1.53 MB
Formato Adobe PDF
1.53 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/398410
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact