PMGNet: Disentanglement and entanglement benefit mutually for compositional zero-shot learning

Liu, Yu; Li, Jianghao; Zhang, Yanyi; Jia, Qi; Wang, Weimin; Pu, Nan; Sebe, Nicu

2024

Abstract

Compositional zero-shot learning (CZSL) aims to model compositions of two primitives (i.e., attributes and objects) to classify unseen attribute-object pairs. Most studies are devoted to integrating disentanglement and entanglement strategies to circumvent the trade-off between contextuality and generalizability. Indeed, the two strategies can mutually benefit when used together. Nevertheless, these studies neglect the significance of developing mutual guidance between the two strategies. In this work, we take full advantage of guidance from disentanglement to entanglement and vice versa. Additionally, we propose exploring multi-scale feature learning to achieve fine-grained mutual guidance in a progressive framework. Our approach, termed Progressive Mutual Guidance Network (PMGNet), unifies disentanglement and entanglement representation learning, allowing the two strategies to learn from and teach each other progressively in one unified model. Furthermore, to alleviate overfitting to seen pairs, we adopt a relaxed cross-entropy loss to train PMGNet, without increasing time or memory cost. Extensive experiments on three benchmarks demonstrate that our method achieves clear improvements and reaches state-of-the-art performance. Moreover, PMGNet exhibits promising performance under the most challenging open-world CZSL setting, especially for unseen pairs.
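The abstract mentions a relaxed cross-entropy loss that curbs overfitting to seen pairs without extra time or memory cost, but the record does not spell out its form. The snippet below is a minimal PyTorch sketch assuming the relaxation behaves like label smoothing over composition logits; the function name relaxed_cross_entropy, the epsilon parameter, and the smoothing scheme are illustrative assumptions, not the paper's actual formulation.

import torch
import torch.nn.functional as F


def relaxed_cross_entropy(logits: torch.Tensor,
                          targets: torch.Tensor,
                          epsilon: float = 0.1) -> torch.Tensor:
    """Cross-entropy with softened (relaxed) targets.

    Hypothetical sketch: the relaxation is implemented here as label
    smoothing, so the model is not pushed to saturate its confidence
    on seen attribute-object pairs.

    logits:  (batch, num_pairs) scores over candidate compositions.
    targets: (batch,) indices of the ground-truth (seen) pair.
    """
    num_pairs = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    with torch.no_grad():
        # Soft target: (1 - epsilon) on the true pair, epsilon spread
        # uniformly over the remaining pairs.
        soft_targets = torch.full_like(log_probs, epsilon / (num_pairs - 1))
        soft_targets.scatter_(-1, targets.unsqueeze(-1), 1.0 - epsilon)
    # One log-softmax and a weighted sum, so the cost stays comparable
    # to standard cross-entropy.
    return -(soft_targets * log_probs).sum(dim=-1).mean()


# Example usage (shapes only):
# logits = model(images)                                  # (batch, num_pairs)
# loss = relaxed_cross_entropy(logits, pair_labels, 0.1)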
PMGNet: Disentanglement and entanglement benefit mutually for compositional zero-shot learning / Liu, Yu; Li, Jianghao; Zhang, Yanyi; Jia, Qi; Wang, Weimin; Pu, Nan; Sebe, Nicu. - In: COMPUTER VISION AND IMAGE UNDERSTANDING. - ISSN 1077-3142. - 249:(2024), pp. 10419701-10419711. [10.1016/j.cviu.2024.104197]

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/436966