Human‐to‐AI Interrater Agreement for Lung Ultrasound Scoring in COVID‐19 Patients / Fatima, Noreen; Mento, Federico; Zanforlin, Alessandro; Smargiassi, Andrea; Torri, Elena; Perrone, Tiziano; Demi, Libertario. - In: JOURNAL OF ULTRASOUND IN MEDICINE. - ISSN 1550-9613. - 42:4(2023), pp. 843-851. [10.1002/jum.16052]

Human‐to‐AI Interrater Agreement for Lung Ultrasound Scoring in COVID‐19 Patients

Fatima, Noreen; Mento, Federico; Demi, Libertario
2023-01-01

Abstract

Objectives: Lung ultrasound (LUS) has sparked significant interest during COVID-19. LUS is based on the detection and analysis of imaging patterns. Vertical artifacts and consolidations are some of the recognized patterns in COVID-19. However, the interrater reliability (IRR) of these findings has not yet been thoroughly investigated. The goal of this study is to assess IRR in LUS COVID-19 data and determine how many LUS videos and operators are required to obtain a reliable result.

Methods: A total of 1035 LUS videos from 59 COVID-19 patients were included. Videos were randomly selected from a dataset of 1807 videos and scored by six human operators (HOs). The videos were also analyzed by artificial intelligence (AI) algorithms. Fleiss' kappa coefficient results are presented, evaluated at both the video and prognostic levels.

Results: Findings show stable agreement when a minimum of 500 videos is evaluated. At the video level, Fleiss' kappa coefficients of 0.464 (95% confidence interval [CI] = 0.455–0.473) and 0.404 (95% CI = 0.396–0.412) are obtained for pairs of HOs and for AI versus HOs, respectively. At the prognostic level, Fleiss' kappa coefficients of 0.505 (95% CI = 0.448–0.562) and 0.506 (95% CI = 0.458–0.555) are obtained for pairs of HOs and for AI versus HOs, respectively.

Conclusions: To examine IRR and obtain a reliable evaluation, a minimum of 500 videos is recommended. Moreover, the employed AI algorithms achieve results that are comparable with those of HOs. This research further provides a methodology that can be useful to benchmark future LUS studies.
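For readers unfamiliar with the agreement statistic named in the abstract, the following is a minimal sketch of how Fleiss' kappa can be computed from a set of ratings. It is not the authors' code. The function name `fleiss_kappa`, the four score categories (0 to 3), and the example scores are all assumptions made for illustration; the abstract does not specify the scoring scale or the implementation used.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each rated by the same number of raters.

    ratings    -- list of items; each item is a list of category labels,
                  one label per rater (hypothetical input layout)
    categories -- list of all possible category labels
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # How many raters assigned each category to each item.
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the fraction of agreeing rater pairs.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    proportions = [sum(c[cat] for c in counts) / total for cat in categories]
    p_expected = sum(p ** 2 for p in proportions)

    if p_expected == 1.0:
        # All ratings fall in one category, so kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Made-up scores from six raters on five videos (illustration only).
    example = [
        [0, 0, 0, 0, 1, 0],
        [2, 2, 3, 2, 2, 2],
        [1, 1, 2, 1, 1, 0],
        [3, 3, 3, 3, 3, 3],
        [1, 2, 2, 1, 2, 2],
    ]
    print(round(fleiss_kappa(example, categories=[0, 1, 2, 3]), 3))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, so the video-level values of 0.464 and 0.404 reported above sit between those two extremes.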
Files in this item:
File: J of Ultrasound Medicine - 2022 - Fatima - Human‐to‐AI Interrater Agreement for Lung Ultrasound Scoring in COVID‐19.pdf
Access: open access
Type: Publisher's version (publisher's layout)
License: Creative Commons
Size: 1.12 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/350261