Is the generalizability of a developed artificial intelligence algorithm for COVID-19 on chest CT sufficient for clinical use? Results from the International Consortium for COVID-19 Imaging AI (ICOVAI)

Topff, Laurens; Groot Lipman, Kevin B. W.; Guffens, Frederic; Wittenberg, Rianne; Bartels-Rutten, Annemarieke; van Veenendaal, Gerben; Hess, Mirco; Lamerigts, Kay; Wakkie, Joris; Ranschaert, Erik; Trebeschi, Stefano; Visser, Jacob J.; Beets-Tan, Regina G. H.; Guiot, Julien; Snoeckx, Annemiek; Kint, Peter; Van Hoe, Lieven; Quattrocchi, Carlo Cosimo; Dieckens, Dennis; Lounis, Samir; Schulze, Eric; Sjer, Arnout Eric-bart; van Vucht, Niels; Tielbeek, Jeroen A. W.; Raat, Frank; Eijspaart, Daniël; Abbas, Ausami

doi:10.1007/s00330-022-09303-3

Objectives: Only a few published artificial intelligence (AI) studies for COVID-19 imaging have been externally validated. Assessing the generalizability of developed models is essential, especially when considering clinical implementation. We report the development of the International Consortium for COVID-19 Imaging AI (ICOVAI) model and perform independent external validation.

Methods: The ICOVAI model was developed using multicenter data (n = 1286 CT scans) to quantify disease extent and assess COVID-19 likelihood using the COVID-19 Reporting and Data System (CO-RADS). A ResUNet model was modified to automatically delineate lung contours and infectious lung opacities on CT scans, after which a random forest predicted the CO-RADS score. After internal testing, the model was externally validated on a multicenter dataset (n = 400) by independent researchers. CO-RADS classification performance was calculated using linearly weighted Cohen’s kappa, and segmentation performance using the Dice similarity coefficient (DSC).

Results: Comparing internal with external testing, segmentation performance for lung contours was equally excellent (DSC = 0.97 vs. DSC = 0.97, p = 0.97). Segmentation performance for lung opacities was adequate internally (DSC = 0.76) but significantly worse on external validation (DSC = 0.59, p < 0.0001). For CO-RADS classification, agreement with radiologists was substantial on the internal set (kappa = 0.78) but significantly lower on the external set (kappa = 0.62, p < 0.0001).

Conclusion: In this multicenter study, a model developed for CO-RADS score prediction and quantification of COVID-19 disease extent showed a significant reduction in performance on independent external validation compared with internal testing. The limited reproducibility of the model restricted its potential for clinical use. The study demonstrates the importance of independent external validation of AI models.

Key Points:
• The ICOVAI model for prediction of CO-RADS and quantification of disease extent on chest CT of COVID-19 patients was developed using a large sample of multicenter data.
• Performance was substantial on internal testing but significantly reduced on external validation performed by independent researchers. The limited generalizability of the model restricts its potential for clinical use.
• Results of AI models for COVID-19 imaging on internal tests may not generalize well to external data, demonstrating the importance of independent external validation.
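The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the ICOVAI code: the function names, the flattened 0/1 mask representation, the five-category CO-RADS scale passed as `categories`, and the handling of edge cases are all assumptions made for illustration. It shows the standard definitions of the Dice similarity coefficient, 2|A∩B| / (|A| + |B|), and of Cohen's kappa with linear disagreement weights |i − j| / (K − 1).

```python
from collections import Counter


def dice_coefficient(pred_mask, ref_mask):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred_mask) != len(ref_mask):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    total = sum(1 for p in pred_mask if p) + sum(1 for r in ref_mask if r)
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def linear_weighted_kappa(rater_a, rater_b, categories):
    """Cohen's kappa with linear weights for ordered categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)
    # Joint counts and per-rater marginal counts over category indices.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)
    obs_dis = 0.0  # weighted observed disagreement
    exp_dis = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear disagreement weight
            obs_dis += w * observed[(i, j)] / n
            exp_dis += w * marg_a[i] * marg_b[j] / (n * n)
    if exp_dis == 0:
        # Assumption: kappa is reported as undefined when both raters
        # used one and the same category for every case.
        raise ValueError("kappa is undefined for constant identical ratings")
    return 1.0 - obs_dis / exp_dis


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    model_scores = [1, 2, 3, 4, 5, 5, 3]
    reader_scores = [1, 3, 3, 4, 4, 5, 2]
    print(linear_weighted_kappa(model_scores, reader_scores, [1, 2, 3, 4, 5]))
```

With linear weights, a disagreement of one CO-RADS category is penalized less than a disagreement of several categories, which is why a weighted kappa suits an ordinal score better than the unweighted form.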

Is the generalizability of a developed artificial intelligence algorithm for COVID-19 on chest CT sufficient for clinical use? Results from the International Consortium for COVID-19 Imaging AI (ICOVAI) / Topff, Laurens; Groot Lipman, Kevin B. W.; Guffens, Frederic; Wittenberg, Rianne; Bartels-Rutten, Annemarieke; van Veenendaal, Gerben; Hess, Mirco; Lamerigts, Kay; Wakkie, Joris; Ranschaert, Erik; Trebeschi, Stefano; Visser, Jacob J.; Beets-Tan, Regina G. H.; Guiot, Julien; Snoeckx, Annemiek; Kint, Peter; Van Hoe, Lieven; Quattrocchi, Carlo Cosimo; Dieckens, Dennis; Lounis, Samir; Schulze, Eric; Sjer, Arnout Eric-bart; van Vucht, Niels; Tielbeek, Jeroen A. W.; Raat, Frank; Eijspaart, Daniël; Abbas, Ausami. - In: EUROPEAN RADIOLOGY. - ISSN 0938-7994. - 33:6(2023), pp. 4249-4258. [10.1007/s00330-022-09303-3]

2023-01-01

Date of publication: 2023
Journal title: EUROPEAN RADIOLOGY
Issue number and part: 6
DOI: https://dx.doi.org/10.1007/s00330-022-09303-3
PubMed identifier: 36651954
Scopus identifier: 2-s2.0-85146600661
WOS identifier: WOS:000926626700002
Appears in types: 03.1 Journal article

Files in this item:

File	Size	Format
330_2022_Article_9303.pdf (open access; Type: Publisher’s version; License: Creative Commons)	2.15 MB	Adobe PDF