
Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering / Mei, Guofeng; Saltori, Cristiano; Ricci, Elisa; Sebe, Nicu; Wu, Qiang; Zhang, Jian; Poiesi, Fabio. - In: INTERNATIONAL JOURNAL OF COMPUTER VISION. - ISSN 0920-5691. - 132:8(2024), pp. 3251-3269. [10.1007/s11263-024-02027-5]

Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering

Saltori, Cristiano; Ricci, Elisa; Sebe, Nicu; Poiesi, Fabio
2024-01-01

Abstract

Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires careful, application-dependent selection of the types of augmentations to be performed, thus potentially biasing the information learned by the network during self-training. Moreover, several unsupervised methods focus only on uni-modal information, which can be problematic for sparse and textureless point clouds. To address these issues, we propose an augmentation-free unsupervised approach for point clouds, named CluRender, which learns transferable point-level features by leveraging uni-modal information for soft clustering and cross-modal information for neural rendering. Soft clustering enables self-training through a pseudo-label prediction task, where the affiliation of points to their clusters is used as a proxy under the constraint that these pseudo-labels divide the point cloud into approximately equal partitions. This allows us to formulate a clustering loss that minimizes the standard cross-entropy between pseudo and predicted labels. Neural rendering generates photorealistic renderings from various viewpoints to transfer photometric cues from 2D images to the learned features. The consistency between rendered and real images is then measured to form a fitting loss, which is combined with the cross-entropy loss to self-train the network. Experiments on downstream applications, including 3D object detection, semantic segmentation, classification, part segmentation, and few-shot learning, demonstrate that our framework outperforms state-of-the-art techniques.
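The abstract's equipartition-constrained pseudo-labeling can be illustrated with a Sinkhorn-Knopp balancing step followed by the cross-entropy clustering loss. The following is a minimal NumPy sketch under the assumption of a SwAV-style balanced soft assignment; all function names, shapes, and hyperparameters are hypothetical and not taken from the paper:

```python
import numpy as np

def sinkhorn_pseudo_labels(scores, n_iters=50):
    """Balanced soft assignment of N points to K clusters (hypothetical sketch).

    `scores` is an (N, K) matrix of point-to-cluster similarities.
    Alternating row/column normalization (Sinkhorn-Knopp) enforces
    approximately equal cluster sizes, yielding soft pseudo-labels Q.
    """
    Q = np.exp(scores).T          # (K, N) non-negative assignment matrix
    Q /= Q.sum()                  # normalize total mass to 1
    K, N = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(axis=1, keepdims=True)  # each cluster gets mass 1/K
        Q /= K
        Q /= Q.sum(axis=0, keepdims=True)  # each point gets mass 1/N
        Q /= N
    return (Q * N).T              # (N, K); each row is a soft pseudo-label

def clustering_loss(pred_log_probs, Q):
    """Cross-entropy between pseudo-labels Q and predicted log-probabilities."""
    return -np.mean(np.sum(Q * pred_log_probs, axis=1))

rng = np.random.default_rng(0)
scores = rng.normal(size=(128, 8))          # stand-in for network point features
Q = sinkhorn_pseudo_labels(scores)          # balanced pseudo-labels
# log-softmax of the same scores plays the role of the predicted labels
pred = scores - np.log(np.sum(np.exp(scores), axis=1, keepdims=True))
loss = clustering_loss(pred, Q)
```

After balancing, each column of `Q` sums to roughly N/K, i.e. the pseudo-labels partition the 128 points into eight approximately equal clusters, which is the constraint the abstract refers to.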
Files in this item:

UnsupervisedPointCloud.pdf
  Access: open access
  Description: Online First
  Type: Other attachments
  License: Creative Commons
  Size: 2.82 MB
  Format: Adobe PDF

s11263-024-02027-5.pdf
  Access: open access
  Description: Final published version
  Type: Publisher's layout
  License: Creative Commons
  Size: 2.81 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11572/420270
Citations
  • PubMed Central: not available
  • Scopus: 0
  • Web of Science: 0
  • OpenAlex: not available