Cross-Lingual Textual Entailment and Applications

Mehdad, Yashar

doi:10.15168/11572_369166

Textual Entailment (TE) has been proposed as a generic framework for modeling language variability. The great potential of integrating (monolingual) TE recognition components into NLP architectures has been reported in several areas, such as question answering, information retrieval, information extraction and document summarization. Mainly due to the absence of cross-lingual TE (CLTE) recognition components, similar improvements have not yet been achieved in any corresponding cross-lingual application. In this thesis, we propose and investigate Cross-Lingual Textual Entailment (CLTE) as a semantic relation between two text portions in dierent languages. We present dierent practical solutions to approach this problem by i) bringing CLTE back to the monolingual scenario, translating the two texts into the same language; and ii) integrating machine translation and TE algorithms and techniques. We argue that CLTE can be a core tech- nology for several cross-lingual NLP applications and tasks. Experiments on dierent datasets and two interesting cross-lingual NLP applications, namely content synchronization and machine translation evaluation, conrm the eectiveness of our approaches leading to successful results. As a complement to the research in the algorithmic side, we successfully explored the creation of cross-lingual textual entailment corpora by means of crowdsourcing, as a cheap and replicable data collection methodology that minimizes the manual work done by expert annotators.

Cross-Lingual Textual Entailment and Applications / Mehdad, Yashar. - (2012), pp. 1-205. [10.15168/11572_369166]

Cross-Lingual Textual Entailment and Applications

Mehdad, Yashar

2012-01-01

Abstract

Scheda breve

Scheda completa

Scheda completa (DC)

	Data di esame finale/Defended on
	
				2012
			
	Ciclo
	
				XXIV
			
	Anno Accademico
	
				2011-2012
			
	Dipartimento
	
				Ingegneria e Scienza dell'Informaz (cess.4/11/12)
			
	Corso di dottorato
	
				Informatica e telecomunicazioni (fino a.a. 2020-21, 36° ciclo)
			
	Supervisore/Relatore di tesi Unitn (Unitn internal supervisor)
	
				Federico, Marcello
			
	Supervisore/Relatore di tesi esterno (External supervisor)
	
				Negri, Matteo
			
	Tesi in cotutela (Bi-nationally supervised Doctoral Thesis)
	
				no
			
	Codice DOI
	
				https://dx.doi.org/10.15168/11572_369166
			
	Lingua (Language)
	
				Inglese
			
	Settori scientifico-disciplinari (validi fino a 24/06/2024) - Reference SSD (valid until 24/06/2024)
	
				Settore INF/01 - Informatica
			
	Appare nelle tipologie:
	
				08.1 Tesi di dottorato (Doctoral Thesis)

File in questo prodotto:

File	Dimensione	Formato
submitted_to_committee.pdf accesso aperto Tipologia: Tesi di dottorato (Doctoral Thesis) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 3.96 MB Formato Adobe PDF Visualizza/Apri	3.96 MB	Adobe PDF	Visualizza/Apri