Active annotation: Bootstrapping annotation lexicon and guidelines for supervised NLU learning

Marinelli, Franca; Cervone, A.; Tortoreto, G.; Stepanov, E. A.; Riccardi, G.
2019-01-01

Abstract

Natural Language Understanding (NLU) models are typically trained in a supervised learning framework. In the case of intent classification, the predicted labels are predefined by the designed annotation schema, and labeling is a laborious task in which annotators manually inspect each utterance and assign the corresponding label. We propose an Active Annotation (AA) approach that combines unsupervised learning in the embedding space, a human-in-the-loop verification process, and linguistic insights to create lexicons that can form open categories and be adapted over time. In particular, annotators define the y-label space on the fly during annotation through an iterative process, without requiring prior knowledge of the input data. We evaluate the proposed annotation paradigm in a real use-case NLU scenario. Results show that our Active Annotation paradigm yields more accurate, higher-quality training data, with an annotation speed an order of magnitude higher than the traditional human-only baseline annotation methodology.
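The abstract describes an iterative loop in which utterances are clustered in an embedding space and an annotator verifies each cluster and names its intent, building the label lexicon on the fly. The Python sketch below illustrates that general loop only; the pre-computed sentence embeddings, k-means clustering, and console prompt are illustrative assumptions, not the authors' actual implementation.

# Minimal sketch of an active-annotation loop (assumptions: pre-computed
# utterance embeddings, k-means clustering, console-based annotator input).
import numpy as np
from sklearn.cluster import KMeans

def active_annotation(utterances, embeddings, n_clusters=10):
    """Return (utterance-index -> intent) labels and an intent lexicon
    built with human-in-the-loop verification of embedding clusters."""
    lexicon = {}   # intent label -> example utterances
    labels = {}    # utterance index -> intent label
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=0).fit_predict(embeddings)

    for c in range(n_clusters):
        members = np.where(clusters == c)[0]
        # Show a few representative utterances from the cluster to the annotator.
        print(f"Cluster {c} examples:", [utterances[i] for i in members[:5]])
        intent = input("Intent label for this cluster (Enter to skip): ").strip()
        if not intent:
            continue  # annotator rejects the cluster; it can be revisited later
        # The y-label space grows on the fly as new intents are introduced.
        lexicon.setdefault(intent, []).extend(utterances[i] for i in members)
        labels.update({int(i): intent for i in members})
    return labels, lexicon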
2019
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Baixas, France
International Speech Communication Association
Marinelli, Franca; Cervone, A.; Tortoreto, G.; Stepanov, E. A.; Fabbrizio, G. D.; Riccardi, G.
Active annotation: Bootstrapping annotation lexicon and guidelines for supervised NLU learning / Marinelli, Franca; Cervone, A.; Tortoreto, G.; Stepanov, E. A.; Fabbrizio, G. D.; Riccardi, G. - (2019), pp. 574-578. (Paper presented at the 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, held in Graz, 15th-19th September 2019) [10.21437/Interspeech.2019-2537].
Files in this record:
File: IS19-ActiveAnnotation.pdf (open access)
Type: Publisher's version (Publisher's layout)
License: All rights reserved
Size: 100.92 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/250240
Citations
  • PMC: not available
  • Scopus: 3
  • Web of Science (ISI): 3
  • OpenAlex: not available