Recent neural network approaches to sentence matching compute the probability of two sentences being similar by minimizing a logistic loss. In this paper, we learn sentence representations by means of a siamese network, which: (i) uses encoders that share parameters; and (ii) enables the comparison of two sentences in terms of their Euclidean distance, by minimizing a contrastive loss. Moreover, we add a multilayer perceptron to the architecture to simultaneously optimize the contrastive and the logistic losses. In this way, the network can exploit more informative feedback: that of the logistic loss, which is also quantified by the distance between the two sentences' representations in Euclidean space. We show that jointly minimizing the two losses yields higher accuracy than minimizing them independently. We verify this finding by evaluating several baseline architectures on two sentence matching tasks: question paraphrasing and textual entailment recognition. Our network approaches the state of the art, while being much simpler and faster to train, and having fewer parameters than its competitors.
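The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the contrastive loss follows the common margin-based form, the logistic loss is binary cross-entropy, and the names `margin`, `weight`, and `prob` are hypothetical. In the paper the match probability would come from the multilayer perceptron; here it is passed in as a plain number.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two sentence vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(distance, label, margin=1.0):
    """Margin-based contrastive loss (assumed form, not taken from the paper).

    label 1 = matching pair: penalize any distance.
    label 0 = non-matching pair: penalize only distances inside the margin.
    """
    if label == 1:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def logistic_loss(prob, label, eps=1e-12):
    """Binary cross-entropy on a predicted match probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def joint_loss(u, v, prob, label, margin=1.0, weight=1.0):
    """Sum of the two losses; `weight` is a hypothetical balancing factor."""
    d = euclidean_distance(u, v)
    return contrastive_loss(d, label, margin) + weight * logistic_loss(prob, label)


# A matching pair whose vectors are far apart is penalized by both terms.
print(joint_loss([0.0, 0.0], [3.0, 4.0], prob=0.5, label=1))
```

Summing the two terms means that a gradient step on the shared encoders is driven by both the distance between the representations and the classifier's prediction, which is the intuition behind the joint training described above.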
Accurate Sentence Matching with Hybrid Siamese Networks / Nicosia, Massimo; Moschitti, Alessandro. - Electronic. - (2017), pp. 2235-2238.
|Title:||Accurate Sentence Matching with Hybrid Siamese Networks|
|Authors:||Nicosia, Massimo; Moschitti, Alessandro|
|Book author(s):||Massimo Nicosia and Alessandro Moschitti|
|Title of the volume containing the essay:||Proceedings of the 2017 ACM on Conference on Information and Knowledge Management|
|Place of publication:||New York NY, USA|
|Publisher:||ACM Digital Library|
|Year of publication:||2017|
|Appears in types:||02.1 Essay in a miscellaneous volume or Book Chapter|
Files in this item:
|2017_CIKM_Moschitti_Siamese.pdf||Refereed post-print (refereed author’s manuscript)||All rights reserved||Open Access|
|3132847.3133156.pdf||Publisher’s version (publisher’s layout)||All rights reserved||Administrator|