Head Computed Tomography (CT) scans are commonly used in emergency departments to identify neurological conditions. However, manual radiology reporting is time-consuming and subject to cognitive biases, especially under time constraints. This work presents a self-attentive deep fusion framework for automatic radiology reporting tailored to emergency head CT imaging. Building on our previously developed system, the proposed model incorporates clinically validated preprocessing for head CTs, a multimodal neural network refined through fine-grained regularization, and intra-sequence self-attention to enhance context modeling. Each scan is represented by eight uniformly resampled slices. Each report section is generated as an independent caption, and the predictions are semantically ranked by a Transformer-based evaluator using perplexity. The best predictions are concatenated to form ordered reports. The model was trained and evaluated on a dataset of real-world emergency head CTs using ROUGE-L and METEOR as quantitative metrics. The proposed system achieved improvements of +6% in ROUGE-L and +3% in METEOR compared to the reference architecture, and +4% and +2%, respectively, compared to the same model without self-attention. These results confirm the contribution of self-attention to improving contextual coherence and report quality. The framework remains lightweight and scalable, making it suitable for time-sensitive environments. Future work will explore condition-specific modeling and vision-language model integration.
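The slice selection and perplexity-based ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "uniformly resampled" means picking evenly spaced slice indices, that the best candidate caption is the one with the lowest perplexity, and that per-token log-probabilities are already available from some language-model evaluator. The function names, the section names, and the toy log-probabilities are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def uniform_slice_indices(num_slices: int, num_samples: int = 8) -> List[int]:
    """Pick num_samples evenly spaced indices in [0, num_slices - 1].

    Assumption: uniform resampling is approximated by index selection.
    If the scan has fewer slices than num_samples, indices repeat.
    """
    if num_slices <= 0:
        raise ValueError("scan has no slices")
    if num_samples == 1:
        return [num_slices // 2]
    step = (num_slices - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean natural-log token probability."""
    if not token_logprobs:
        raise ValueError("empty sequence")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# One candidate = (caption text, per-token log-probs from the evaluator).
Candidate = Tuple[str, List[float]]


def assemble_report(
    candidates: Dict[str, List[Candidate]], section_order: Sequence[str]
) -> str:
    """Keep the lowest-perplexity caption per section, then join in order."""
    parts = []
    for section in section_order:
        best_text, _ = min(candidates[section], key=lambda c: perplexity(c[1]))
        parts.append(best_text)
    return " ".join(parts)


if __name__ == "__main__":
    print(uniform_slice_indices(40))  # indices for a hypothetical 40-slice scan
    toy = {
        "findings": [
            ("No acute hemorrhage.", [-0.1, -0.2]),
            ("Hemorrhage no acute.", [-2.0, -3.0]),
        ],
        "impression": [
            ("Normal study.", [-0.3]),
            ("Study normal is.", [-1.5, -2.5]),
        ],
    }
    print(assemble_report(toy, ["findings", "impression"]))
```

In a real pipeline the log-probabilities would come from scoring each caption with the chosen Transformer language model. The sketch leaves that step out because the abstract does not specify the evaluator.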
Self-Attentive Deep Fusion Framework with Transformer-Based Semantics for Emergency Head CT Reporting / Tomassini, Selene; Zeggada, Abdallah; Quattrocchi, Carlo Cosimo; Melgani, Farid; Giorgini, Paolo. - (2025), pp. 705-710. (IEEE MetroXRAINE 2025, Ancona, Italy, 22-24/10/2025) [10.1109/metroxraine66377.2025.11340501].
Self-Attentive Deep Fusion Framework with Transformer-Based Semantics for Emergency Head CT Reporting
Tomassini, Selene (first author); Zeggada, Abdallah; Quattrocchi, Carlo Cosimo (co-last author); Melgani, Farid (co-last author); Giorgini, Paolo (co-last author)
2025-01-01
| File | Size | Format |
|---|---|---|
| 2025 - IEEE MetroXRAINE.pdf (archive managers only). Description: Paper_IEEEMetroXRAINE25. Type: Publisher's version (publisher's layout). License: All rights reserved. | 499.15 kB | Adobe PDF |



