TranSLUCEnT: Transferred Sequential Lung Ultrasound Characteristic Encodings-based Transformer for Lung Ultrasound Pattern Classification in Premature Neonates / Khan, U., Fatima, N., Han, X., Rigotti, C., Cattaneo, F., Dognini, G., Ventura, M.L., Zannin, E., Iacca, G., Demi, L. - (2024), pp. 1-4. (2024 IEEE Ultrasonics, Ferroelectrics, and Frequency Control Joint Symposium, UFFC-JS 2024, Taipei Nangang Exhibition Center, Hall 1, No. 1, Jingmao 2nd Rd., Nangang District, Taiwan, 2024) [10.1109/uffc-js60046.2024.10793539].
TranSLUCEnT: Transferred Sequential Lung Ultrasound Characteristic Encodings-based Transformer for Lung Ultrasound Pattern Classification in Premature Neonates
Khan, Umair;Fatima, Noreen;Han, Xi;Iacca, Giovanni;Demi, Libertario
2024-01-01
Abstract
Neonatal respiratory disorders pose significant clinical challenges and often demand rapid, accurate diagnosis for effective patient management. Lung ultrasound (LUS) has recently taken on a notable role in evaluating neonatal patients with respiratory diseases. However, little work has been done to automate this evaluation and assist clinicians. Reliable training of deep learning (DL)-based methods for this purpose requires sufficient representative data. To address the limited availability of annotated data, this study uses domain knowledge from an existing adult patient population for video-level classification of LUS patterns in newborns. It introduces TranSLUCEnT, a transformer-based video-level LUS pattern classification model that transfers domain knowledge from adults to newborns. To this end, the model uses frame-level encodings of neonatal LUS data, extracted with a ResNet-18 model previously trained on adult LUS data, which lets it capture relevant features and transfer them to the analysis of neonatal LUS data. The proposed model is evaluated on 417 neonatal videos from 35 patients using leave-one-out cross-validation, with the data from a single exam held out in each fold. TranSLUCEnT achieved a mean video-level accuracy of 76.2% with respect to the ground truth, defined as the majority label provided by 3 clinical operators. In comparison, the state-of-the-art video vision transformer (ViViT), trained from scratch, achieved an accuracy of 49.8%. Moreover, whereas ViViT processes N tokens per frame in a video of M frames (N × M tokens in total), TranSLUCEnT processes only M tokens. As a result, it achieves significantly higher video classification performance at lower computational cost than the state-of-the-art ViViT model.

| File | Size | Format |
|---|---|---|
| TranSLUCEnT_Transferred_Sequential_Lung_Ultrasound_Characteristic_Encodings-based_Transformer_for_Lung_Ultrasound_Pattern_Classification_in_Premature_Neonates.pdf. Archive managers only. Type: Publisher's version (publisher's layout). License: All rights reserved. | 1.19 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
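
The token-count comparison in the abstract (N × M tokens for ViViT versus M tokens for TranSLUCEnT) can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `token_counts` and the example values (32 frames, 196 tokens per frame) are assumptions chosen for the example, not figures from the paper. The last entry further assumes plain, unfactorised self-attention, in which the number of token pairs grows with the square of the sequence length; the model variants actually compared in the paper may differ.

```python
def token_counts(frames: int, tokens_per_frame: int) -> dict:
    """Compare sequence lengths for per-patch vs. per-frame tokenization.

    frames           -- M, the number of frames in the video
    tokens_per_frame -- N, the number of tokens a per-patch model uses per frame
    """
    per_patch = frames * tokens_per_frame  # N x M tokens, as the abstract states for ViViT
    per_frame = frames                     # M tokens: one encoding per frame
    return {
        "per_patch_tokens": per_patch,
        "per_frame_tokens": per_frame,
        # How many times longer the per-patch sequence is (equals N).
        "token_ratio": per_patch // per_frame,
        # Ratio of token pairs under full self-attention (equals N squared).
        "attention_pair_ratio": (per_patch ** 2) // (per_frame ** 2),
    }


# Hypothetical example values, not taken from the paper.
example = token_counts(frames=32, tokens_per_frame=196)
for name, value in example.items():
    print(f"{name}: {value}")
```

Under these assumed values the per-patch sequence is N times longer than the per-frame one, and the pairwise attention work is N² times larger, which is one way to read the "lower computational cost" claim.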



