Multi-Domain Lifelong Visual Question Answering via Self-Critical Distillation / Lao, Mingrui; Pu, Nan; Liu, Yu; Zhong, Zhun; Bakker, Erwin M.; Sebe, Nicu; Lew, Michael S. - (2023), pp. 4747-4758. (Paper presented at The 31st ACM International Conference on Multimedia, held in Ottawa, Canada, 29 October - 3 November 2023) [10.1145/3581783.3612121].
Multi-Domain Lifelong Visual Question Answering via Self-Critical Distillation
Pu, Nan; Zhong, Zhun; Sebe, Nicu
2023
Abstract
Visual Question Answering (VQA) has achieved significant success over the last few years, yet most studies focus on training a VQA model on a stationary domain (e.g., a given dataset). In real-world application scenarios, however, these methods are often inefficient because VQA systems are always expected to extend their knowledge and meet the ever-changing demands of users. In this paper, we introduce a new and challenging multi-domain lifelong VQA task, dubbed MDL-VQA, which encourages the VQA model to continuously learn across multiple domains while mitigating forgetting on previously learned domains. Furthermore, we propose a novel replay-free Self-Critical Distillation (SCD) framework tailor-made for MDL-VQA, which alleviates the forgetting issue by transferring previous-domain knowledge from the teacher to the student model. First, we propose to introspect the teacher's understanding of original and counterfactual samples, thereby creating informative instance-relevant and domain-relevant knowledge for logits-based distillation. Second, for feature-based distillation, we propose to introspect the reasoning behavior of the student model to identify the harmful domain-specific knowledge acquired in the current domain, and further leverage a metric-learning strategy to encourage the student to learn useful knowledge in the new domain. Extensive experiments demonstrate that the SCD framework outperforms state-of-the-art competitors under different training orders.
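The abstract describes a teacher-student distillation setup with a logits-based term computed on both original and counterfactual samples and a feature-based term guided by metric learning. The sketch below is a minimal, hypothetical illustration of how such a combined loss could be wired up in PyTorch; the model interface (a model returning a (logits, features) pair), the names `distillation_step`, `counterfactual_batch`, and `answer_targets`, and the specific loss choices (KL divergence for logits, cosine distance for features) are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of one teacher-student distillation step combining
# logits-based KD on original and counterfactual inputs with a simple
# feature-based term, in the spirit of the abstract. All names are illustrative.
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, batch, counterfactual_batch,
                      T=2.0, alpha=0.5, beta=0.5):
    """Return task loss + logits distillation + feature-based term."""
    teacher.eval()
    with torch.no_grad():
        # Assumed interface: model(batch) -> (answer logits, fused features).
        t_logits, t_feat = teacher(batch)
        t_logits_cf, _ = teacher(counterfactual_batch)
    s_logits, s_feat = student(batch)
    s_logits_cf, _ = student(counterfactual_batch)

    # Current-domain VQA task loss (multi-label BCE is a common choice).
    task_loss = F.binary_cross_entropy_with_logits(
        s_logits, batch["answer_targets"].float())

    # Logits-based distillation on original and counterfactual samples.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                  F.softmax(t_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    kd_cf = F.kl_div(F.log_softmax(s_logits_cf / T, dim=-1),
                     F.softmax(t_logits_cf / T, dim=-1),
                     reduction="batchmean") * T * T

    # Feature-based term: pull student features toward the teacher's, a simple
    # stand-in for the metric-learning strategy mentioned in the abstract.
    feat_loss = 1.0 - F.cosine_similarity(s_feat, t_feat, dim=-1).mean()

    return task_loss + alpha * (kd + kd_cf) + beta * feat_loss
```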
File
3581783.3612121 (2)-compressed.pdf
Access: Open access
Type: Publisher's version (Publisher's layout)
License: All rights reserved
Size: 473.55 kB
Format: Adobe PDF