Skim-Attention: Learning to Focus via Document Layout

IRIS

Transformer-based pre-training techniques of text and layout have proven effective in a number of document understanding tasks. Despite this success, multimodal pre-training models suffer from very high computational and memory costs. Motivated by human reading strategies, this paper presents Skim-Attention, a new attention mechanism that takes advantage of the structure of the document and its layout. Skim-Attention only attends to the 2- dimensional position of the words in a document. Our experiments show that SkimAttention obtains a lower perplexity than prior works, while being more computationally efficient. Skim-Attention can be further combined with long-range Transformers to efficiently process long documents. We also show how Skim-Attention can be used off-the-shelf as a mask for any Pre-trained Language Model, allowing to improve their performance while restricting attention. Finally, we show the emergence of a document structure representation in Skim-Attention.

Skim-Attention: Learning to Focus via Document Layout / Nguyen, Laura; Scialom, Thomas; Staiano, Jacopo; Piwowarski, Benjamin. - (2021), pp. 2413-2427. (Intervento presentato al convegno EMNLP tenutosi a Punta Cana, Dominican Republic nel 7th-11th November 2021) [10.18653/v1/2021.findings-emnlp.207].

Skim-Attention: Learning to Focus via Document Layout

Nguyen, Laura^Primo;Scialom, Thomas^Secondo;Staiano, Jacopo^Penultimo;Piwowarski, Benjamin^Ultimo

2021-01-01

Abstract

Transformer-based pre-training techniques of text and layout have proven effective in a number of document understanding tasks. Despite this success, multimodal pre-training models suffer from very high computational and memory costs. Motivated by human reading strategies, this paper presents Skim-Attention, a new attention mechanism that takes advantage of the structure of the document and its layout. Skim-Attention only attends to the 2- dimensional position of the words in a document. Our experiments show that SkimAttention obtains a lower perplexity than prior works, while being more computationally efficient. Skim-Attention can be further combined with long-range Transformers to efficiently process long documents. We also show how Skim-Attention can be used off-the-shelf as a mask for any Pre-trained Language Model, allowing to improve their performance while restricting attention. Finally, we show the emergence of a document structure representation in Skim-Attention.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione (Date of publication)
	
				2021
			
	Titolo del volume (Proceedings title)
	
				Findings of the Association for Computational Linguistics: EMNLP 2021
			
	Luogo di edizione (Place of publication)
	
				Stroudsburg, PA, USA
			
	Casa editrice (Publisher)
	
				Association for Computational Linguistics
			
	ISBN
	
				978-1-955917-10-0
			
	Codice Scopus (Scopus Identifier)
	
				2-s2.0-85129154862
			
	Codice WOS (WOS identifier)
	
				WOS:001181828801039
			
	Tutti gli autori
	
						Nguyen, Laura; Scialom, Thomas; Staiano, Jacopo; Piwowarski, Benjamin
					
	Citazione
	
				Skim-Attention: Learning to Focus via Document Layout / Nguyen, Laura; Scialom, Thomas; Staiano, Jacopo; Piwowarski, Benjamin. - (2021), pp. 2413-2427. (Intervento presentato al  convegno EMNLP tenutosi a Punta Cana, Dominican Republic nel 7th-11th November 2021) [10.18653/v1/2021.findings-emnlp.207].
			
	Appare nelle tipologie:
	
				04.1 Saggio in atti di convegno (Paper in Proceedings)

File in questo prodotto:

File	Dimensione	Formato
2021.findings-emnlp.207.pdf Solo gestori archivio Descrizione: Skim-Attention: Learning to Focus via Document Layout Tipologia: Versione editoriale (Publisher’s layout) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 2.01 MB Formato Adobe PDF Visualizza/Apri	2.01 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/392049

Citazioni

ND

4

1

ND

social impact