Unveiling Open-set Noise: Theoretical Insights into Label Noise

Feng, Chen; Sebe, Nicu; Tzimiropoulos, Georgios; Rodrigues, Miguel R. D.; Patras, Ioannis

doi:10.1145/3746027.3755040

Learning with Noisy Labels (LNL) reduces reliance on high-quality labeled data but often overlooks open-set noise, where noisy samples belong to unknown classes, in contrast to closed-set noise, where they belong to known categories. This paper advances LNL by reformulating the problem to incorporate open-set noise through a complete noise transition matrix, enabling a theoretical comparison of its impact on classification error rates against closed-set noise. Our analysis reveals that open-set noise induces smaller error increases, with distinct effects from 'hard' (semantically similar to inliers) and 'easy' (dissimilar) variants. We evaluate entropy-based detection, finding it effective only for easy open-set noise, and propose solutions leveraging vision-language models and self-supervised learning to address hard noise challenges. For empirical validation, we introduce CIFAR100-O, ImageNet-O, and a WebVision open-set test set, enabling robust benchmarking of LNL methods under open-set noise conditions. Recognizing the limitations of classification accuracy in capturing model robustness, we advocate out-of-distribution (OOD) detection as a complementary metric. Our theoretical and empirical results highlight the unique challenges of open-set noise, offering new tools and evaluation frameworks to enhance LNL robustness in real-world scenarios.
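The entropy-based detection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the normalization by log K, and the threshold of 0.8 are all assumptions chosen for illustration. The idea is that a sample whose predicted class distribution has high entropy is flagged as a possible open-set sample.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the result lies in [0, 1].

    Assumes at least two classes (K >= 2).
    """
    k = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)

def flag_open_set(logits_batch, threshold=0.8):
    """Flag samples whose prediction entropy exceeds a threshold.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return [normalized_entropy(softmax(l)) > threshold for l in logits_batch]

# A confident prediction and a near-uniform prediction over 4 classes.
batch = [[10.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
flags = flag_open_set(batch)
```

In this sketch only the second sample is flagged, because its prediction is uniform over the known classes. A sample that the model assigns confidently to one known class is never flagged, whatever its true origin, which is the basic limitation of a score computed from prediction confidence alone.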

Unveiling Open-set Noise: Theoretical Insights into Label Noise / Feng, Chen; Sebe, Nicu; Tzimiropoulos, Georgios; Rodrigues, Miguel R. D.; Patras, Ioannis. - (2025), pp. 3290-3299. (ACM Multimedia, Dublin, October 2025) [10.1145/3746027.3755040].